[Alerting V2] Register "assign episode" workflow trigger#268915
[Alerting V2] Register "assign episode" workflow trigger#268915adcoelho wants to merge 26 commits into
Conversation
|
Pinging @elastic/response-ops (Team:ResponseOps) |
96980ed to
2714f6f
Compare
…w-trigger-assign-episode
|
@macroscope-app review |
cnasikas
left a comment
There was a problem hiding this comment.
Tested and is working as expected!
| * Publishes an `episode.assigned` domain event for a persisted assign action. | ||
| * No-op when `assignee_uid` is null (unassign). | ||
| */ | ||
| emitEpisodeAssigned(request: KibanaRequest, action: AlertAction): void; |
There was a problem hiding this comment.
Is there any schema we can use to type the action for the assigned action?
|
Just a note that this PR also includes some changes from @cnasikas for the event bus which i reviewed and tested. |
| import { z } from '@kbn/zod/v4'; | ||
| import type { CommonTriggerDefinition } from '@kbn/workflows-extensions/common'; | ||
|
|
||
| export const EPISODE_ASSIGNED_TRIGGER_ID = 'alertingV2.episodeAssigned' as const; |
There was a problem hiding this comment.
Do we really need to specify V2 here? isn't that an implementation detail?
There was a problem hiding this comment.
We want to distinguish between the existing alerting and the new one (v2) in case we want to introduce triggers for the v1 alerting in the future. Any suggestions?
There was a problem hiding this comment.
don't have any suggestions atm, maybe @tinnytintin10 could help
There was a problem hiding this comment.
Discussed briefly this one with @tinnytintin10, suggestion is to remove V2 from name, as it also felt a bit as implementation detail. The next step will be for us to offer extra metadata for supporting V1 and V2 if you decide to ship something for alerting V1. Also you can explicitly put in the description that this one is related to V2 if that could help users further, wdyt?
There was a problem hiding this comment.
@tiamliu based on our recent conversations around how to cater to "mixed experiences" effectively, please check my thinking here?
I think we should call these kinds of events/triggers "alerting.[event_name]" and as much as possible make them version-agnostic. For events that apply to both, we can send them through the same flow, and include "version" as a payload discriminator for people who only want to handle one or the other for some reason (likely suggested by a consultant, support, etc).
Very rough examples:
for alerting.ruleSnoozed
payload might be
{
"rule.id": "some-id-for-rule",
"alerting.version": 2,
"timestamp": //...
// ...etc
}for alerting.alertAssigned and alerting.alertAcknowledged, maybe only v2 alerts have these features, but that's still okay.
Thoughts?
There was a problem hiding this comment.
Thank you both for the suggestions, and really appreciate being thorough here, as the trigger ID is not an internal name, but it is part of the public contract users put in their workflow YAML. For events that genuinely exist in both V1 and V2 with the same schemas, a common payload makes a lot of sense, and it will avoid users from having to configure the same workflow twice, one for V1 and one for V2. Where I would like to push back is that V2 has a totally different philosophy, for example, episodes. V1 has alert instances and recoveries with a different lifecycle, so for a lot of V2 events, there is no conceptually equivalent V1 event to merge with. For these V2 events, the version is always going to be 2, never anything else, and presenting a V2-only trigger as if it were version-agnostic is exactly the confusing mixed experience we're trying to prevent. We move the version from the trigger ID to the payload, which is still user-facing.
Some technical concerns when/if the payload diverges between versions on the same events: When we register a trigger, we need to provide a zod schema. We need to construct the schema with a discriminated union based on alerting.version having one schema per version. Workflow authors will have to know which fields to use and how to branch between different versions. We would also need to document which field is on which version. Also, if in the future we need to introduce versioning on the schema of the v2 payload (I hope we will never have to do this 🙂 ), we would need to have a second discriminator (alerting.v2.version) for the new schema. Separating the events based on its top level discriminator (the trigger ID) eliminates these issues, especially if each one's schemas want to evolve differently. Lastly, if we namespace on the trigger ID, it will be easier to deprecate v1 triggers in the future while we are deprecating the alerting v1, telemetry, and audit become a bit easier (not scanning the payload to figure out the version), and discoverability in the workflows UI is unambiguous (a user browsing their trigger catalog knows immediately which alerting world it belongs to).
On the leaking implementation detail in the trigger ID, I hear the concern, but all of our APIs already have the V2 in their path and are exposed to the users. I would suggest keeping the distinction of v1 and v2 in the trigger ID level (maybe rename it to alerting.v2.episodeAssigned or something very different) when the events are different, and keep the common event framework agnostic if the events are the same with the same payload.
Very happy to be wrong about specific cases, though, and if you got an event in mind where the shared-id pattern works even for V2-only features, I would love to talk it through.
|
@elasticmachine merge upstream |
|
|
||
| export const episodeAssignedPayloadSchema = z | ||
| .object({ | ||
| occurredAt: z.string().describe( |
| defaultMessage: 'ISO timestamp of when the assignment occurred.', | ||
| }) | ||
| ), | ||
| groupHash: z.string().describe( |
| defaultMessage: 'Stable hash of the alert grouping the episode belongs to.', | ||
| }) | ||
| ), | ||
| episodeId: z.string().describe( |
| defaultMessage: 'Identifier of the alerting episode whose assignee changed.', | ||
| }) | ||
| ), | ||
| ruleId: z.string().describe( |
| defaultMessage: 'Identifier of the alerting rule the episode belongs to.', | ||
| }) | ||
| ), | ||
| spaceId: z.string().describe( |
| actorUid: z | ||
| .string() |
| assigneeUid: z | ||
| .string() |
💛 Build succeeded, but was flaky
Failed CI StepsTest Failures
Metrics [docs]Module Count
Async chunks
Page load bundle
History
|
Closes #264977
Summary
assign.null.Testing
Create a workflow that uses the new trigger:
executiontab that it ran successfully